SVM-RFE peak selection for cancer classification with mass spectrometry data

Authors

  • Kaibo Duan
  • Jagath C. Rajapakse
Abstract

We studied two cancer classification problems with mass spectrometry data and used SVM-RFE to select a small subset of peaks as input variables for classification. Our study shows that SVM-RFE can select a good small subset of peaks with which the classifier achieves high prediction accuracy, and that performance is much better than with the feature subset selected by T-statistics. We also found that the best peak subset selected by SVM-RFE always overlaps with the top-ranked peaks by T-statistics, while it also includes some peaks that are ranked low by T-statistics. Taken together, however, these peaks give much better classification performance than the same number of top-ranked peaks by T-statistics. Our experimental comparison of the Support Vector Machine classification algorithm with and without peak selection also confirms the importance of peak selection for cancer classification with mass spectrometry data. Selecting a small subset of peaks improves not only the efficiency of the classification algorithms but also the cancer classification accuracy, even for algorithms such as Support Vector Machines that are capable of handling a large number of input variables.
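The two ranking strategies compared in the abstract can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, not the authors' implementation: the function names, the toy data, the hyperparameters, and the choice to remove one feature per iteration are all assumptions. The linear SVM is trained with plain full-batch subgradient descent as a stand-in for a real SVM solver. The elimination criterion, dropping the feature with the smallest squared weight, is the usual SVM-RFE ranking criterion.

```python
import math

def t_statistic_ranking(X, y):
    """Rank features by absolute Welch t-statistic between classes +1 and -1, best first."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == -1]
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row in pos]
        b = [row[j] for row in neg]
        mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
        var_a = sum((v - mean_a) ** 2 for v in a) / (len(a) - 1)
        var_b = sum((v - mean_b) ** 2 for v in b) / (len(b) - 1)
        # Guard against a zero denominator for constant features.
        denom = math.sqrt(var_a / len(a) + var_b / len(b)) or 1e-12
        scores.append(abs(mean_a - mean_b) / denom)
    return sorted(range(len(scores)), key=lambda j: -scores[j])

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Assumed stand-in trainer: subgradient descent on the regularized hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for row, label in zip(X, y):
            margin = label * (sum(wj * xj for wj, xj in zip(w, row)) + b)
            if margin < 1:  # only margin violators contribute to the hinge subgradient
                for j in range(d):
                    grad_w[j] -= label * row[j] / n
                grad_b -= label / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def svm_rfe_ranking(X, y, **svm_kwargs):
    """Recursive feature elimination: retrain, drop the feature with the smallest w_j^2, repeat."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y, **svm_kwargs)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]  # the feature eliminated last is ranked first

# Hypothetical toy "spectra": column 0 separates the classes, columns 1 and 2 are noise.
X = [[2.0, 0.1, 0.5], [1.8, -0.2, -0.4], [2.2, 0.3, 0.1],
     [-2.1, 0.2, 0.4], [-1.9, -0.1, -0.5], [-2.0, 0.0, 0.2]]
y = [1, 1, 1, -1, -1, -1]

t_rank = t_statistic_ranking(X, y)
rfe_rank = svm_rfe_ranking(X, y)
print("T-statistic ranking:", t_rank)
print("SVM-RFE ranking:   ", rfe_rank)
```

On this toy data both methods place the separating column first, so it does not reproduce the paper's finding that the two rankings diverge. The difference matters on real spectra because T-statistics score each peak in isolation, whereas SVM-RFE scores peaks by their weight in a classifier trained on all remaining peaks together.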

Similar resources

Diagnosis of early relapse in ovarian cancer using serum proteomic profiling.

Surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) mass spectrometry data has been increasingly analyzed to identify biomarkers that help with early detection of the disease. Ovarian cancer commonly recurs, at a rate of 75%, within a few months to several years after standard treatment. Since recurrent ovarian cancer is relatively difficult to diagnose and small tumo...

Recursive feature elimination with random forest for PTR-MS analysis of agroindustrial products

In this paper we apply the recently introduced Random Forest-Recursive Feature Elimination (RF-RFE) algorithm to the identification of relevant features in the spectra produced by Proton Transfer Reaction-Mass Spectrometry (PTR-MS) analysis of agroindustrial products. The method is compared with the more traditional Support Vector Machine-Recursive Feature Elimination (SVM-RFE), extended to all...

A Novel SVM-RFE for Gene Selection

Selecting a subset of informative genes from microarray expression data is a critical data preparation step in cancer classification and other biological function analysis. The support vector machine recursive feature elimination (SVM-RFE) is one of the most effective feature selection methods and has been successfully used in selecting informative genes for cancer classification. While, the SV...

Gene selection for cancer classification using the combination of SVM-RFE and GA

Gene selection is a key research issue in molecular cancer classification and identification of cancer biomarkers using microarray data. Support vector machine recursive feature elimination (SVM-RFE) is a well-known algorithm for this purpose. In this study, a novel gene selection algorithm is proposed to enhance the SVM-RFE method. The proposed approach is designed to use the combination of SV...

Semi-supervised SVM-based Feature Selection for Cancer Classification using Microarray Gene Expression Data

Gene expression data always suffer from high dimensionality, so feature selection becomes a fundamental tool in the analysis of cancer classification. Basically, the data can be collected easily without providing the label information, which is quite useful in improving the accuracy of the classification. Label information is usually difficult to obtain as the labelling processes ...

Publication date: 2005